
feat: agent team #976

Closed
lppp04808 wants to merge 78 commits into sipeed:main from lppp04808:main

@lppp04808
Contributor

@lppp04808 lppp04808 commented Mar 2, 2026

📝 Description

Implements a multi-agent Teams architecture for PicoClaw, enabling the coordinator agent to decompose complex tasks and delegate to specialized sub-agents running concurrently as goroutines.

Key features added:

  • team tool: orchestrate multi-agent pipelines with sequential, parallel, dag, and evaluator_optimizer strategies
  • spawn_sub_agent tool: delegate a single isolated task to a sub-agent
  • Heterogeneous Agents: per-member model field allows routing tasks to different LLMs (e.g., vision model for screenshots, code model for generation)
  • Model tags: annotate models with capability labels (vision, code, fast, long-context, reasoning, image-gen) to guide the coordinator's routing decisions
  • Auto-Reviewer: members can declare produces: "code"/"data"/"document" to trigger an automatic QA reviewer after all workers complete
  • Soft token budget: budget exhaustion injects a graceful wrap-up signal instead of hard-failing all workers
  • Truncation recovery: detects finish_reason=length and injects a retry message instead of looping on malformed JSON
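The truncation-recovery item above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Message` type, the `RecoverFromTruncation` function, and the wording of the retry prompt are all hypothetical. Only the idea of reacting to a `length` or `truncated` finish reason by injecting a retry message comes from the description.

```go
package main

import "fmt"

// Message is a hypothetical stand-in for one chat history entry.
type Message struct {
	Role    string
	Content string
}

// retryPrompt is an assumed wording; the real injected message may differ.
const retryPrompt = "Your previous response was cut off by the output token limit. " +
	"Resend the tool call with shorter arguments, or split the work into smaller steps."

// isTruncated reports whether the provider signalled that the output was cut off.
func isTruncated(finishReason string) bool {
	return finishReason == "length" || finishReason == "truncated"
}

// RecoverFromTruncation appends a retry message when the last response was
// truncated and reports whether the caller should run another iteration.
func RecoverFromTruncation(history []Message, finishReason string) ([]Message, bool) {
	if !isTruncated(finishReason) {
		return history, false
	}
	return append(history, Message{Role: "user", Content: retryPrompt}), true
}

func main() {
	history, retry := RecoverFromTruncation(nil, "length")
	fmt.Println(retry, len(history))
}
```

The point of the design is that malformed tool-call JSON is never parsed: the loop sees the finish reason first and asks the model to try again.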

🗣️ Type of Change

  • ✨ New feature (non-breaking change which adds functionality)

🤖 AI Code Generation

  • 🛠️ Mostly AI-generated (AI draft, Human verified/modified)

🔗 Related Issue

N/A

📚 Technical Context

  • Reference URL: N/A
  • Reasoning: The Teams architecture enables PicoClaw to handle complex multi-step tasks (e.g., React→Vue migration, full-stack feature development) by decomposing them into specialized sub-agents that run concurrently where possible, with explicit dependency management via DAG strategy.

🧪 Test Environment

  • Hardware: PC
  • OS: Linux
  • Model/Provider:
  • Channels: CLI

☑️ Checklist

  • My code/docs follow the style of this project.
  • I have performed a self-review of my own changes.
  • I have updated the documentation accordingly.

lppp04808 and others added 25 commits February 28, 2026 13:45
- Added 'team' and 'spawn_sub_agent' tools to support Coordinator-Worker patterns.
- Implemented execution strategies: Sequential, Parallel, DAG, and Evaluator-Optimizer.
- Added global token budget tracking via 'RemainingTokenBudget' for team cost control.
- Designed 'ConcurrencyUpgradeable' interface and 'ConcurrentFS' wrapper for opt-in, thread-safe file operations during parallel/DAG runs.
- Tested file locking mechanisms preventing race conditions during high concurrency.
- Added 'model' property to 'team' and 'spawn_sub_agent' JSON schemas.
- Modified 'buildWorkerConfig' to override 'baseConfig.Model' when a specific LLM model is requested by the coordinator.
- Allows teams to dynamically mix and match specialized vision, coding, and logical models within the same execution loop.
- Added ModelTag* constants (vision, code, fast, long-context, reasoning) to subagent.go
- Added Tags []string to config.ModelConfig (json:"tags,omitempty")
- Piped tags from config through FallbackCandidate and into SubagentManager.allowedModels
- Changed ResolveCandidatesWithLookup lookup signature to return (string, []string, bool) to carry tags
- Added ModelCapabilityHint() that generates rich per-model routing guidance for the LLM
- Dynamically injected capability hints into TeamTool and SpawnSubAgentTool descriptions
- Fixed fallback_test.go and subagent test files to match updated signatures
- context.go: add Rule #5 'Team delegation' to system prompt — agents now
  instructed to proactively use 'team' for multi-step/multi-concern tasks
  instead of handling them inline
- team.go: add 'WHEN TO USE THIS TOOL' activation triggers to tool description;
  strengthen decomposition rules with domain-agnostic project-manager framing
- loop.go: call subagentManager.SetTools(agent.Tools) after full registry is
  built so sub-agents inherit 'team' tool for recursive hierarchical decomposition
- toolloop.go: replace hard token budget failure with soft graceful degradation
  (50% advisory warning, 0% wrap-up signal + final summary call);
  add truncation recovery for max_tokens cutoff (finish_reason=truncated)
- openai_compat/provider.go: detect truncated JSON tool calls and set
  FinishReason='truncated' instead of silently storing malformed raw args
- Added Produces string field to TeamMember struct
- Added 'produces' property to team tool JSON schema
- Added reviewerTaskTemplates map for code/data/document artifact types
- Added maybeRunAutoReviewer() helper that auto-injects a QA reviewer
  agent after all workers complete when any member declares a produces type
- Wired reviewer into sequential, parallel, and dag execution strategies
- evaluator_optimizer skipped (already has built-in critique loop)
@lppp04808
Contributor Author

lppp04808 commented Mar 12, 2026

Team Tool Configuration Guide

The team tool allows PicoClaw to orchestrate multiple agents to complete complex tasks using various strategies (Sequential, Parallel, DAG, and Evaluator-Optimizer). This guide explains how to configure permissions, limits, and behavior.

Configuration Location

Settings are located in config.json under tools.team.

Comprehensive Example

{
  "tools": {
    "team": {
      "enabled": true,
      "max_members": 5,
      "max_team_tokens": 100000,
      "max_evaluator_loops": 3,
      "max_timeout_minutes": 20,
      "max_context_runes": 8000,
      "disable_auto_reviewer": false,
      "reviewer_model": "gpt-4o-mini",
      "allowed_strategies": [
        "sequential",
        "parallel",
        "dag",
        "evaluator_optimizer"
      ],
      "allowed_models": [
        {
          "name": "gpt-4o",
          "tags": ["vision", "code", "reasoning"]
        },
        {
          "name": "claude-3-5-sonnet",
          "tags": ["coding", "precise"]
        }
      ]
    }
  }
}

Field Reference

| Field | Type | Description |
| --- | --- | --- |
| enabled | boolean | Master toggle for the Team tool. (Required for the tool to appear). |
| max_members | integer | Hard limit on the number of agents in a team. Prevents runaway cost/scope. |
| max_team_tokens | integer | Hard ceiling for total token usage (Input + Output) across ALL members in one team call. |
| max_evaluator_loops | integer | For evaluator_optimizer strategy: Maximum retry attempts if the evaluator rejects the work. |
| max_timeout_minutes | integer | Maximum wall-clock time for the team operation before forced termination. |
| max_context_runes | integer | Max characters/runes injected when passing results from one worker to another (Dependency Context). Prevents context window overflows. Default is 8000. |
| disable_auto_reviewer | boolean | If true, skips the optional QA reviewer phase that normally triggers for "code" or "document" outputs. |
| reviewer_model | string | A specific model to use for the QA Reviewer step. Typically a cheaper/faster model (e.g., gpt-4o-mini) is recommended. |
| allowed_strategies | string[] | List of allowed strategies (sequential, parallel, dag, evaluator_optimizer). If omitted, all are allowed. |
| allowed_models | object[] | A strict allow-list of models the team can use. Tags defined here help the coordinator select the best model for a specific role. |
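As an illustration of how the fields above could map onto Go types, here is a minimal sketch. The JSON keys are taken from the field reference; the Go names (`TeamConfig`, `AllowedModel`, `ParseTeamConfig`, `StrategyAllowed`) are hypothetical and may not match the actual PicoClaw config package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AllowedModel mirrors one entry of "allowed_models" (hypothetical type name).
type AllowedModel struct {
	Name string   `json:"name"`
	Tags []string `json:"tags,omitempty"`
}

// TeamConfig mirrors the "tools.team" object (hypothetical type name).
type TeamConfig struct {
	Enabled             bool           `json:"enabled"`
	MaxMembers          int            `json:"max_members"`
	MaxTeamTokens       int            `json:"max_team_tokens"`
	MaxEvaluatorLoops   int            `json:"max_evaluator_loops"`
	MaxTimeoutMinutes   int            `json:"max_timeout_minutes"`
	MaxContextRunes     int            `json:"max_context_runes"`
	DisableAutoReviewer bool           `json:"disable_auto_reviewer"`
	ReviewerModel       string         `json:"reviewer_model"`
	AllowedStrategies   []string       `json:"allowed_strategies"`
	AllowedModels       []AllowedModel `json:"allowed_models"`
}

// rootConfig captures only the path tools.team from config.json.
type rootConfig struct {
	Tools struct {
		Team TeamConfig `json:"team"`
	} `json:"tools"`
}

// ParseTeamConfig decodes the tools.team section and applies the documented
// default of 8000 for max_context_runes when the field is absent or zero.
func ParseTeamConfig(data []byte) (TeamConfig, error) {
	var root rootConfig
	if err := json.Unmarshal(data, &root); err != nil {
		return TeamConfig{}, err
	}
	cfg := root.Tools.Team
	if cfg.MaxContextRunes == 0 {
		cfg.MaxContextRunes = 8000
	}
	return cfg, nil
}

// StrategyAllowed treats an omitted allowed_strategies list as "all allowed".
func (c TeamConfig) StrategyAllowed(strategy string) bool {
	if len(c.AllowedStrategies) == 0 {
		return true
	}
	for _, s := range c.AllowedStrategies {
		if s == strategy {
			return true
		}
	}
	return false
}

func main() {
	raw := []byte(`{"tools":{"team":{"enabled":true,"max_members":5,"allowed_strategies":["sequential","dag"]}}}`)
	cfg, err := ParseTeamConfig(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.MaxMembers, cfg.MaxContextRunes, cfg.StrategyAllowed("parallel"))
}
```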

Feature Highlights

1. Robust Parallelism

In Parallel mode, if some workers fail while others succeed, the tool returns Partial Success. Successful results are preserved, and failures are summarized separately so the coordinator can decide how to proceed.
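A rough sketch of the partial-success idea, assuming each worker is just a function returning a string or an error. `WorkerResult` and `RunParallel` are hypothetical names for illustration; the real strategy also deals with budgets, timeouts, and logging, which are left out here.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errModelTimeout is a sample failure used by the demo in main.
var errModelTimeout = errors.New("model timeout")

// WorkerResult is a hypothetical record of one worker's outcome.
type WorkerResult struct {
	Role   string
	Output string
	Err    error
}

// RunParallel runs every task in its own goroutine and returns successes and
// failures separately, so one failing worker does not discard the others.
func RunParallel(tasks map[string]func() (string, error)) (succeeded, failed []WorkerResult) {
	// Buffered so that no worker blocks on send before the collector starts.
	results := make(chan WorkerResult, len(tasks))
	var wg sync.WaitGroup
	for role, task := range tasks {
		wg.Add(1)
		go func(role string, task func() (string, error)) {
			defer wg.Done()
			out, err := task()
			results <- WorkerResult{Role: role, Output: out, Err: err}
		}(role, task)
	}
	wg.Wait()
	close(results)

	for r := range results {
		if r.Err != nil {
			failed = append(failed, r)
		} else {
			succeeded = append(succeeded, r)
		}
	}
	return succeeded, failed
}

func main() {
	succeeded, failed := RunParallel(map[string]func() (string, error){
		"coder":    func() (string, error) { return "patch ready", nil },
		"reviewer": func() (string, error) { return "", errModelTimeout },
	})
	fmt.Printf("succeeded=%d failed=%d\n", len(succeeded), len(failed))
}
```

The caller can then report partial success whenever both slices are non-empty, and a full failure only when `succeeded` is empty.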

2. Context Truncation

To prevent "Token Bombs," the tool automatically truncates long outputs when injecting them as dependencies/context for downstream workers. You can tune this via max_context_runes.
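Counting runes rather than bytes matters for non-ASCII output, since slicing a Go string by byte index can cut a multi-byte character in half. A minimal sketch of rune-based truncation follows; the function name, the marker text, and the treatment of a non-positive limit as "no limit" are assumptions, not the PR's actual behavior.

```go
package main

import "fmt"

// truncationMarker is an assumed suffix that tells the downstream worker
// that the dependency output was shortened.
const truncationMarker = "\n...[truncated]"

// TruncateRunes keeps at most maxRunes runes of s. A non-positive limit is
// treated as "no limit" in this sketch.
func TruncateRunes(s string, maxRunes int) string {
	if maxRunes <= 0 {
		return s
	}
	runes := []rune(s)
	if len(runes) <= maxRunes {
		return s
	}
	return string(runes[:maxRunes]) + truncationMarker
}

func main() {
	fmt.Println(TruncateRunes("héllo wörld", 5))
}
```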

3. Dedicated QA Reviewer

The Auto-Reviewer automatically validates artifacts like code or documentation. By setting reviewer_model, you can perform this validation with a highly efficient model to save on latency and costs.

4. DAG Pipeline safety

The DAG strategy implements Cycle Detection (Kahn's Algorithm). If the LLM proposes a circular dependency, the tool will intercept and report it as an error before wasting tokens.
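For readers unfamiliar with Kahn's algorithm, here is a small self-contained sketch of cycle detection over member dependencies. The input shape (member ID mapped to the IDs it depends on) and the name `TopoOrder` are assumptions for illustration; the PR's own data structures may differ.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// TopoOrder returns an execution order in which every member comes after its
// dependencies, or an error if the graph has a cycle or an unknown dependency.
func TopoOrder(deps map[string][]string) ([]string, error) {
	indegree := make(map[string]int, len(deps))
	dependents := make(map[string][]string, len(deps))
	for id := range deps {
		indegree[id] = 0
	}
	for id, ds := range deps {
		for _, d := range ds {
			if _, ok := deps[d]; !ok {
				return nil, fmt.Errorf("member %q depends on unknown member %q", id, d)
			}
			indegree[id]++
			dependents[d] = append(dependents[d], id)
		}
	}

	// Start with every member that has no unmet dependencies.
	var queue []string
	for id, n := range indegree {
		if n == 0 {
			queue = append(queue, id)
		}
	}
	sort.Strings(queue) // stable starting order for reproducible runs

	order := make([]string, 0, len(deps))
	for len(queue) > 0 {
		id := queue[0]
		queue = queue[1:]
		order = append(order, id)
		for _, next := range dependents[id] {
			indegree[next]--
			if indegree[next] == 0 {
				queue = append(queue, next)
			}
		}
	}

	// Members left with a positive in-degree are part of a cycle.
	if len(order) != len(deps) {
		return nil, errors.New("circular dependency detected")
	}
	return order, nil
}

func main() {
	order, err := TopoOrder(map[string][]string{
		"design": nil,
		"code":   {"design"},
		"test":   {"code"},
	})
	fmt.Println(order, err)

	_, err = TopoOrder(map[string][]string{"a": {"b"}, "b": {"a"}})
	fmt.Println(err)
}
```

Because the check runs before any worker is started, a circular plan costs no tokens beyond the coordinator's own call.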

lppp04808 and others added 3 commits March 12, 2026 10:44
…entation

- Fixed dead code in token budget clamping
- Fixed goroutine leak in DAG strategy fast-fail path
- Optimized Evaluator to run without tools in evaluator_optimizer
- Added partial success support to Parallel strategy
- Implemented configurable context truncation via max_context_runes
- Added reviewer_model support for cost-efficient QA
- Added structured logging across all strategies
- Improved ForUser messages with role summaries
- Merged and updated team configuration documentation
@lppp04808
Contributor Author

lppp04808 commented Mar 12, 2026

@imguoguo @Zepan @afjcjsbx

Hi team! 👋

Just wanted to highlight that this PR (#976: feat: agent team) is directly building on and extending the foundational multi-agent work outlined in Issue #294 ("Feature: Base Multi-agent Collaboration Framework & Shared Context"), which is still marked as In Progress and is a key part of the Roadmap's Advanced Capabilities.

This PR implements exactly that missing layer:

  • Full multi-agent team orchestration with 4 strategies (Sequential, Parallel, DAG, Evaluator-Optimizer)
  • Concurrent goroutine execution for efficiency
  • spawn_sub_agent tool + model-tag routing
  • Shared context with configurable truncation/recovery
  • Soft token budgets, partial success handling, reviewer_model for QA
  • Structured logging, detailed docs, and recent fixes (goroutine leaks, dead code, etc. — latest push just now)

It stays compatible with the existing sub-agent foundation (#409/#423) while adding reliable coordination for complex decomposed tasks — precisely what #294 aims to enable as the base for community agents (Coder, Researcher, etc.), swarm mode, and beyond.

No conflicts, CI green, rebased on main.

Would really appreciate any initial feedback or eyes from the core team.

Super happy to iterate further (add tests, break into smaller PRs if needed, or discuss details). This feels like a natural progression to help close #294 and unlock the full multi-agent potential in PicoClaw! 🚀

Thanks for the incredible momentum (v0.2.2 release yesterday, constant merges) — excited to contribute to this growth.

Best regards,
@lppp04808

lppp04808 and others added 12 commits March 12, 2026 13:28
- Merge feat/team: integrate multi-agent orchestration tool into main.
- Fix Team Tool bugs: resolve DAG goroutine leaks and token budget clamping.
- Optimize Team strategies: implement text-only evaluator and partial parallel success.
- Enhance observability: add structured logging and improved user summaries.
- Expand configuration: add max_context_runes and reviewer_model settings.
- Refactor context builder: implement cached system prompt and simplified dynamic context.
- Align with upstream: remove local Embedding support and custom truncation logic for better parity.
- Documentation: update tools_configuration.md with detailed Team tool guide.
- Update TeamTool to accept and use global configuration
- Refactor buildWorkerConfig into a TeamTool method for better context handling
- Improve error handling in DAG and sequential execution flows
- Add comprehensive unit tests for TeamTool sequential execution
- Clean up imports and formatting in toolloop.go and config.go
- Retain  and  during multi-turn tool calls to resolve API 400 errors (missing a thought_signature).
- Append the final assistant message to the session history before exiting the loop, fixing the state memory loss issue in the  team strategy.
- Add fallback to  when standard content is empty (e.g., for Gemini 2.0 Pro Thinking).
- Preserve  during token budget exhaustion and truncation recovery to prevent broken chain-of-thought.
@lppp04808
Contributor Author

Hi,

Thank you for the thoughtful and focused proposal in #1439!
The emphasis on "What this does NOT propose" is especially helpful — it keeps the scope clearly on clarifying boundaries, fixing correctness issues (like orphaning, inaccurate estimation, reactive behavior), and shifting to proactive checks without introducing new abstractions or features.

While working on PR #976 (multi-agent team orchestration), I encountered several of the same context/token pain points in concurrent multi-agent scenarios and applied some patch-style fixes directly on the existing loop to improve reliability. These are not new mechanisms but simple adjustments that might help illustrate or validate the boundary clarifications and correctness improvements outlined in #1439.

Here are the relevant existing fixes in #976 that align with the proposal's goals:

  • Soft token budget handling (proactive awareness instead of abrupt failure)
    Tracks remaining tokens per worker and team level, triggers a 50% advisory warning followed by a 0% wrap-up signal + final summary call to allow graceful completion rather than hard failure.
    Files: toolloop.go / team.go
    → This supports the shift from reactive to proactive budgeting and the history_budget concept for more predictable exhaustion behavior.

  • Configurable injected context length limit (clearer truncation boundaries)
    Introduces max_context_runes (default 8000) to cap the length of context passed to downstream agents, preventing overflow in inter-agent communication.
    Files: team.go / context.go
    → Aligns with defining explicit truncation boundaries and avoiding unintended overflow as described.

  • Truncation & finish_reason recovery (safe recovery from limits)
    Detects finish_reason=length or truncated responses and injects retry/fallback messages to prevent malformed outputs or broken loops.
    Files: provider.go (OpenAI compat) / toolloop.go
    → Complements the proactive pre-call check and safe recovery paths, helping ensure correctness after hitting limits.

  • Preservation of critical context elements during truncation/budget hits
    Ensures thought_signature, session history, and chain-of-thought remain intact even when truncation or budget exhaustion occurs.
    Files: toolloop.go / context.go
    → Supports the tool-pair awareness goal (avoiding cuts in the middle of atomic groups) and maintains continuity without new structures.

  • Partial success in Parallel execution
    Retains successful worker results even if others fail, allowing partial progress.
    → Indirectly aids in avoiding complete loss of valid context segments.
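To make the first bullet above concrete, here is a minimal sketch of a soft budget that emits an advisory once at 50% remaining and a wrap-up signal at 0%. The type and constant names are hypothetical, and the mutex is included only because team members run concurrently; the actual accounting in toolloop.go / team.go may be structured differently.

```go
package main

import (
	"fmt"
	"sync"
)

// BudgetSignal tells the tool loop what to inject before the next LLM call.
type BudgetSignal int

const (
	SignalNone     BudgetSignal = iota // keep going
	SignalAdvisory                     // half the budget is gone: advise brevity
	SignalWrapUp                       // budget exhausted: ask for a final summary
)

// SoftBudget is a hypothetical shared token counter for one team call.
type SoftBudget struct {
	mu        sync.Mutex
	total     int
	remaining int
	warned    bool
}

// NewSoftBudget creates a budget with the given total token allowance.
func NewSoftBudget(total int) *SoftBudget {
	return &SoftBudget{total: total, remaining: total}
}

// Consume records token usage and returns the signal for the caller.
// It never returns an error: exhaustion is reported, not enforced.
func (b *SoftBudget) Consume(tokens int) BudgetSignal {
	b.mu.Lock()
	defer b.mu.Unlock()

	b.remaining -= tokens
	if b.remaining <= 0 {
		b.remaining = 0
		return SignalWrapUp
	}
	if !b.warned && b.remaining*2 <= b.total {
		b.warned = true // the advisory is sent only once
		return SignalAdvisory
	}
	return SignalNone
}

func main() {
	b := NewSoftBudget(1000)
	for _, used := range []int{400, 200, 100, 500} {
		fmt.Println(b.Consume(used))
	}
}
```

On `SignalWrapUp` the loop would make one last summary call instead of failing, which is what separates this from a hard limit.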

All of these are straightforward patches on the current agent loop (50 commits total, CI green, rebased on main), not new features. They were added to make multi-agent flows more robust in practice.

Once the boundary clarifications and correctness fixes from #1439 are applied in the refactor/agent branch, I'm happy to:

  • rebase and adapt these adjustments to the updated context model
  • contribute unit tests specifically targeting safe boundaries, non-orphaning tool pairs, accurate estimation, and proactive timing
  • help validate that the proposed changes reduce wasted calls and improve correctness in multi-agent scenarios

If any of these existing patches can serve as a quick reference for testing the boundary definitions or proactive shifts (e.g., does the soft budget reduce abrupt failures? do the truncation rules prevent orphaning?), feel free to review them — I'd be glad to point to specific sections, share diffs, or assist in porting them.

Thanks again for driving this important track forward — excited to see the context foundation become more solid and predictable!
Any thoughts on priorities, potential conflicts with these patches, or next steps? Open to all feedback.

@lppp04808
Contributor Author

Closing this PR (#976) as we have now opened a clean v2 version:

New PR → feat(tools): restore and enhance team tool with SubTurn integratio

Reason for closing:

  • Original feat: agent team #976 was built on the pre-Phase-1 architecture (direct RunToolLoop calls)

  • v2 has been fully merged and significantly upgraded:

    • Full integration with SubTurnSpawner (isolated sessions, depth limit, concurrency semaphore, timeout, automatic truncation recovery)
    • ConcurrentFS + cross-member token budget tracking
    • Stateful iteration support for evaluator_optimizer strategy
    • Updated documentation and tests
  • Aligns with the explicit task coordination part of "Meta: Agent Refactor Phase 2 - Multi-Agent Collaboration within Single Pico" (#1934)
    (Agent Discovery / implicit collaboration is deferred to a future PR)

This PR has served its purpose. v2 completely supersedes #976.

Thank you very much for the previous reviews on #976!
All future iterations will happen in the new PR. Looking forward to your feedback there~

@lppp04808 lppp04808 closed this Mar 24, 2026